Processing Conjunctive and Phrase Queries with the Set-Based Model

Authors

  • Bruno Pôssas
  • Nivio Ziviani
  • Berthier A. Ribeiro-Neto
  • Wagner Meira
Abstract

This paper presents an extension to the set-based model (SBM), an effective technique for computing term weights based on co-occurrence patterns, so that it can process conjunctive and phrase queries. The extension takes into account the intuition that semantically related term occurrences tend to appear close to each other. The novelty is that all known approaches that account for co-occurrence patterns were initially designed for processing disjunctive (OR) queries, whereas our extension provides a simple, effective and efficient way to process conjunctive (AND) and phrase queries. The technique is time efficient and still improves retrieval effectiveness. Experimental results show that our extension improves the average precision of the answer set for all collections evaluated, while keeping the computational cost small. For the TREC-8 collection, our extension led to gains in average precision, relative to the standard vector space model, of 23.32% for conjunctive queries and 18.98% for phrase queries.
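
To make the two query types named in the abstract concrete, the following minimal sketch contrasts conjunctive (AND) matching with phrase matching over a small positional index. It illustrates only the query semantics and is not the set-based model or its term weighting; the function names (`build_positional_index`, `conjunctive_match`, `phrase_match`) and the toy documents are assumptions made for illustration.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each term to {doc_id: [positions]} for a dict of doc_id -> text."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def conjunctive_match(index, terms):
    """Return ids of documents that contain every query term (AND semantics)."""
    postings = [set(index.get(t, {})) for t in terms]
    return set.intersection(*postings) if postings else set()


def phrase_match(index, terms):
    """Return ids of documents where the terms occur consecutively, in order."""
    result = set()
    # A phrase match is a conjunctive match with an extra adjacency constraint.
    for doc_id in conjunctive_match(index, terms):
        starts = index[terms[0]][doc_id]
        if any(
            all(s + i in index[t][doc_id] for i, t in enumerate(terms))
            for s in starts
        ):
            result.add(doc_id)
    return result


# Hypothetical toy collection, used only to exercise the functions above.
docs = {
    1: "set based model for phrase queries",
    2: "queries about a phrase in the model",
    3: "vector space model",
}
index = build_positional_index(docs)
and_hits = conjunctive_match(index, ["phrase", "queries"])
phrase_hits = phrase_match(index, ["phrase", "queries"])
```

Here documents 1 and 2 both satisfy the conjunctive query, but only document 1 contains the two terms adjacent and in order, so it is the only phrase match. Term proximity is what separates the two answer sets.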

Similar resources

South African Language Resources: Phrase Chunking

Phrase chunking remains an important natural language processing (NLP) technique for intermediate syntactic processing. This paper describes the development of protocols, annotated phrase chunking data sets and automatic phrase chunkers for ten South African languages. Various problems with adapting the existing annotation protocols of English are discussed as well as an overview of the annotat...

The Completeness Problem of Ordered Relational Databases

Support of order in query processing is a crucial component in relational database systems, not only because the output of a query is often required to be sorted in a specific order, but also because employing order properties can significantly reduce the query execution cost. Therefore, finding an effective approach to answer queries over ordered data is important to the efficiency of query pr...

Optimizing Reformulation-based Query Answering in RDF

Reformulation-based query answering is a query processing technique aiming at answering queries under constraints. It consists of reformulating the query based on the constraints, so that evaluating the reformulated query directly against the data (i.e., without considering any more the constraints) produces the correct answer set. In this paper, we consider optimizing reformulation-based query...

A New Model for Phrase Search Based on Minimum Weighted Displacement

Finding high-quality web pages is one of the most important tasks of search engines. The relevance between the documents found and the query searched depends on the user observation and increases the complexity of ranking algorithms. The other issue is that users often explore just the first 10 to 20 results while millions of pages related to a query may exist. So search engines have to use sui...

Simulating and Optimizing the Conjunctive Use of Surface and Groundwater Resources Using the System Dynamics Approach (A Case Study: Dashte-Abbas Irrigation Network)

The construction of irrigation network and the water transfer from Karkheh Dam to Dashte-Abbas, due to neglecting the groundwater resources has increased groundwater level and waterlogging of the agricultural land in the recent years. The aim of this study was, therefore, to optimize the conjunctive use of surface and groundwater resources in Dashte-Abbas to minimize waterlogging problems and a...

Publication date: 2004